DOCUMENT RESUME

ED 084 909                                        FL 004 533

AUTHOR         Meeker, Mary; Meeker, Robert
TITLE          Strategies for Assessing Intellectual Patterns in
               Black, Anglo, and Mexican-American Boys -- or Any
               Other Children -- and Implications for Education.
PUB DATE       [73]
NOTE           32p.

EDRS PRICE     MF-$0.65 HC-$3.29
DESCRIPTORS    Anglo Americans; Aptitude Tests; Cognitive Tests;
               Cultural Factors; Culture Free Tests; *Educational
               Testing; *Group Intelligence Tests; Intelligence
               Quotient; *Intelligence Tests; Mexican Americans;
               *Minority Group Children; Negro Youth; Prognostic
               Tests; Spanish Speaking; Student Testing; *Test Bias;
               Test Construction; Test Interpretation; Test
               Reliability; Test Validity
IDENTIFIERS    *Stanford Binet Intelligence Test



ABSTRACT

In this analysis of intelligenc
group children, the implications of inadequate
discussed. Several aspects of test design are e
in intelligence testing, cultural bias, constru
diagnostic utility. A sample set of results der
Stanford-Binet test administered to 257 respond
statistical data are included. The author concl
"investigations of cultural biases in intellige
established the fact that the most widely used
'penalizing' for non-Anglo, lower socioeconomic
cautioned of the dangers in using group-test re
programs geared to individual needs. (RL)

Today it is common practice for those who speak on
behalf of disadvantaged students to be against IQ testing in
the schools. The basis of their opposition is a familiar
scenario: A student is tested; the test may be, for a variety
of reasons, inaccurate; if so, the test results typically
characterize the student as below normal; the characterization
functions as a label which prompts attitudes and treatments
that are subtly (or otherwise) transmitted to the student;
the student, in turn, tends to reflect behavior that
fulfills the expectation. In short, if the psychologist tells
the teacher that Johnny is below average, the teacher will
treat him as such, and Johnny will respond accordingly, all
because the practice of testing is little related to the
reason for testing.

No one doubts that this scenario has been, and still
is being, played out countless times in schools throughout
the nation. The situation, however, is not at issue; the
issue is what to do about it. To understand the issue better,
we need to follow the scenario one step further.

Once the problem is acknowledged, a further dialogue
ensues, in which there are four principals: those who
represent the disadvantaged, those who are directly concerned with
instruction, those who are involved in test construction, and,
in the center of this dialogue, the school psychologist.
Representatives of the disadvantaged are advocating the abolition
of all IQ testing; psychometricians are analyzing the sources
of test inaccuracies, and discussing means for making tests
more valid, more reliable and culture-free; and, close at
hand, the classroom teachers are requesting professional
assistance for their daily encounters. The school psychologist
must respond to all of them, and although it is the response
to the classroom teacher that is critical, it is necessary to
examine each response in turn.

To the representative of the disadvantaged: The proposed
abolition of IQ testing is well-intended, but ill-conceived;
the abolition of all testing would only serve to drive the
phenomenon underground. To deny teachers access to formal
assessments is to force them to make informal assessments that
are subject to the same sorts of deficiencies, which, not
being exposed to scrutiny, are not open to easy identification
and correction. (This also denies the need for formal
assessments that some students must have to meet legislated
qualifications in many special programs.)



To the psychometrician: Analyses of latent sources of
testing deficiencies are helpful, and the proposed programs
for rectification are welcomed, but operations cannot be
suspended while we wait for more reliable and valid instruments.
The job of constructing better tests is an exacting task that
takes time, but the educational need is always immediate and
on-going, so the best available instruments must be used.

The school psychologist is, then, in a position where
he or she can neither forfeit nor defer responsibility. That
does not mean doing business as usual; many of the cited
injustices of the present practices can be eradicated by
making some changes in the professional services provided. To
best understand the nature and importance of these changes,
we need to reexamine the dynamics of the opening scenario.
The inescapable conclusion drawn from that scenario is that
deficient assessments are almost uniformly insufficient, if
not detrimental, for the ensuing instruction. It seems
obvious that poor diagnosis would produce dysfunctional treatment,
but we might look at the situation more closely, and rather
than focus on the potentiality of error, we can ask why the
effect of error is so overwhelming.

Upon reflection, one must wonder how a gross deficiency
in assessment--and, by the nature of the case, the inaccuracy
must be more than marginal--how an error of such magnitude
can be perpetuated. Why, in other words, don't the child's
strong abilities naturally assert themselves and thereby right
the situation? Consider the analogous situation in medicine:
Doctors make diagnoses, and they are not always accurate, but
hospitals are not full of healthy patients who have been
mistakenly identified as sick.

The analogy is misleading, but instructive for its
contrast; there are two essential points of difference: First,
medical diagnosis is specific, and second, treatment is
considered an extension of diagnosis in the sense that a patient's
response to treatment is part of a continuous and reflexive
process of diagnostic review. This latter, of course, does
not happen in the schools. To underscore these points of
contrast, we need only recast the opening scenario in medical
terms: The doctor diagnoses a referral as merely sick or
below average health, passes this global assessment on to the
treatment personnel who, regarding the patient as "sick," put
him under general hospital care (no prescribed treatment)--
under these circumstances, the patient's general health may
never show enough improvement for the patient to be discharged.

To bring this back to the main line of discussion, the
problem is not the potentiality of error, so much as the fact
that the assessment is so general in kind that, as a
consequence, it bears little relation to treatment. If assessments
are specific and prescriptively related to treatment, then
there is considerably less chance that errors, when they do
occur, will be perpetuated.



All of this serves as an extended prologue to considering
the school psychologist's response to requests from
classroom teachers for professional assistance. The response
cannot be general, global, and unrelated to instructional
treatment. In the study to follow we suggest one, though
certainly not the only, means of being specific and
prescriptive.

Deficiencies in 
Intelligence Testing 

To provide a proper framework for our approach we
need to look at the deficiencies of intelligence testing in
the most general terms possible. If we ask the question,
"What's wrong with intelligence testing?" we find researchers
responding to three different aspects of the problem: (1)
Cultural bias--tests (and test administration) are inadequate
because they are predicated on a cultural norm that is
penalizing to those outside the norm. (2) Construct validity--
tests are inadequate because they systematically exclude
important aspects of intelligence. (3) Diagnostic utility--
tests are inadequate because they fail to provide adequate
information for treatment. These three aspects of the
problem are clearly distinct. There could be, for instance,
culture fair tests that were conceptually invalid, and there
could be conceptually valid tests that were diagnostically
sterile--any combination is possible, and, importantly, it is
generally conceded that the most widely used tests (and
testing procedures) are generally deficient in all three
respects.

Cultural Bias. The importance of cultural concomitants
in interpreting intelligence test results is evident in
a study by Mercer, et al. (1972). They found a direct
relationship between cultural background and IQ test measures.
Undifferentiated test results for Chicano and Black children
were (on an average for both groups) about ten points below
average. These undifferentiated results are similar to what
other investigators have found; the importance of the Mercer
project is that they could account for this below-average
performance by "exogenous" cultural concomitants. Applying a
five-factor index of "Anglicized" culture--(1) the mother
wants the child to have an education beyond high school, (2)
the parents are married, (3) the family are home owners, (4)
the father has a skilled job, and (5) the family is relatively
small and intact--the IQ scores were grouped according to the
degree of "Anglicized" cultural background. The differentiated
average for the Anglo (score of 5) was average or slightly
above average; on the other hand, members of the least
Anglicized group (score of 0) were a standard deviation below the
norm. This pattern of results held for both Blacks and
Chicanos. The penalizing effects of a non-Anglo background are
obvious and conclusive in this study.
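
The five-factor index described above can be sketched in code. This is a minimal illustration only: it assumes the index is a simple count of the indicators that are present, which is consistent with the reported range of 0 to 5 but is not spelled out in the text. The class and field names are hypothetical and are not taken from Mercer, et al.

```python
from dataclasses import dataclass, astuple


@dataclass
class FamilyBackground:
    """Hypothetical record of the five indicators listed in the text."""
    mother_wants_education_beyond_high_school: bool
    parents_married: bool
    family_owns_home: bool
    father_has_skilled_job: bool
    family_small_and_intact: bool


def index_score(background: FamilyBackground) -> int:
    """Assumed scoring rule: one point per indicator present (0 to 5)."""
    return sum(1 for present in astuple(background) if present)


# Two made-up records marking the ends of the scale.
highest = FamilyBackground(True, True, True, True, True)
lowest = FamilyBackground(False, False, False, False, False)
print(index_score(highest), index_score(lowest))
```

Grouping test scores by this count, rather than by ethnic label alone, is what the text means by a "differentiated" average.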



The same general point has been underscored and
amplified in many other studies. On a slightly different, but
highly related line of investigation, a number of researchers
(Pasamanick and Knoblock, 1955; Bloom, 1964; Bereiter, 1965;
Gray and Klaus, 1965; Lesser, Fifer, Clark and others, 1965)
have been concerned with the effects of examiner bias.
Generally they have found that a cultural difference between
examiner and examinee has an adverse effect on the resulting
IQ score. The focus of these investigations is not blatant
prejudice on the part of the examiner--that would be easily
detected--but rather on the subtle effects of the rapport and
language necessary for adequate responses for a power test of
intelligence. In all, these studies underscore a significant
problem in the procedures of traditional standardized
intelligence testing.

Other investigations have been concerned with the
nature of the intervening variables that might serve to explain
why tests are culturally biased. In other words, granting the
fact of cultural bias, a number of studies have concentrated
on identifying the characteristic differences in test
performances that might account for the cultural bias. Typical of
this line of investigation is the longitudinal study of
Hertzig, Birch, Thomas, and Mendez (1968). They "amplified"
the normal mode of intelligence testing--in addition to the
usual recording of right and wrong Binet answers, with each
child using his preferred language, more detailed observations
of examinee responses were made. Each response was
characterized as verbal or non-verbal, and further classified as to
its elaboration (i.e., whether the response was limited to
the expected one or was spontaneously extended or explained).
They compared the responses from Puerto Rican children of
lower-class blue-collar workers with responses from Anglo
children of middle-class professionals. The Anglo children
were significantly more verbal and elaborating in their
responses. For those who know the response-dynamics of the
testing situation, it is not unreasonable to conclude that
the culturally related differences in mode of response would
account, at least in some measure, for the culturally related
differences in test performance. It is possible that children
who are able to elaborate, even in a trial-and-error guess,
have a better chance at arriving at an acceptable answer.

To summarize: Investigations of cultural biases in
intelligence testing have established the fact that the most
widely used tests and test procedures are "penalizing" for
non-Anglo, lower socioeconomic groups.

Construct Validity. A second approach to the general
issue of testing deficiencies is concerned with what
intelligence tests are, in fact, measuring. In other words, this
line of investigation questions whether the most widely used
tests (especially the Binet, a power test, and the WISC, a
speeded test of intelligence) are adequate as instruments of
measurement. They obviously measure something, but is it the
whole, or even the most significant aspect of intelligence?
While the construct validity of any testing instrument is
always (in principle) open to question, the field of
intelligence testing presented a situation where, for all practical
purposes, IQ scores and intelligence had become (and are
still considered by many to be) synonymous. A number of
researchers have been concerned to break the "set" of a
unidimensional, static concept of intelligence.

One line of investigation has been to question the
assumed constancy of intelligence. (This research has, of
course, been prompted, influenced, and guided by the work of
developmental psychologists, preeminently Piaget.) As an
example, McCall, Hogarty, and Hurlburt (1972), at the Fels
Research Institute, made a longitudinal study of general Binet
IQ scores. Their investigations underscore the importance of
the developmental aspects of intelligence, i.e., that a general
index of intelligence does not hold constant for the same
respondent over time. To quote their summary and conclusions:

The most pronounced trend spanning the entire infancy
period involved the manipulative exploration of objects
that produced perceptual contingencies at 6 months, the
imitation of simple fine motor and elementary verbal
behavior particularly in a social contact at 12 months,
verbal labeling and comprehension at 18 months, and
verbal fluency and grammatical maturity at 24 months.

Moreover, to label as "mental" performances at
every age perpetuates the belief in a pervasive and
developmentally constant intelligence. Consequently, the term
mental as applied to infant behavior or tests should be
abandoned in favor of some conceptually more neutral label,
perhaps Piaget's "sensorimotor," "perceptual-motor," or
even more specific classes of behaviors (e.g., exploration
of perceptual contingencies, imitation, language). The
network of transitions between skills at one age and
another is likely more specific and complex than once
thought, and not accurately subsumed under one general
concept.

Psychometricians have also questioned the (presumed)
adequacy of a unidimensional index of intelligence. As early
as the 1930's W. P. Alexander (1934), after Thurstone, found
that general intelligence accounted for only 10% of success
in shop achievement (spatial ability accounted for 13%,
motivation for 48%, and 34% remained unaccounted). Research into
specific intellectual abilities (as contrasted with general
ability) has been developing ever since.

Most notable among these developments has been the
work of Guilford and his associates (Guilford, 1956). Using
factor-analytic techniques, they found sets of distinct
intellectual abilities which could be conceptualized along three
dimensions, which they referred to as the Structure of
Intellect. (An elaboration of the theoretical SI model by Meeker
was named the SOI for purposes of application; the schema is
given later in this article.) Subsequent to this pioneering
work, which used adult males as the subject population, other
investigators (Meeker, 1963; Meyers, et al., 1964; Orpet and
Meyers, 1966; Sitkei, 1966; Ball, 1972 [see her contribution
elsewhere in this journal]) have found similar factors among
normal, mentally retarded, physically handicapped, and gifted
children.



The inadequacy of a general index of intelligence
seems apparent and undoubtedly the trend toward greater
differentiation will continue. Nonetheless, the instruments of
general assessment will not be quickly or easily displaced
in the school context for two reasons: First, the instruments
are familiar to practitioners and they are, undeniably,
statistically sound. Second, there is, at present, no practical
substitute for the Binet and WISC; i.e., there are no
differentiated abilities tests (group or individual) that can be
used within the limits of time and personnel that are normally
allocated to testing. In other words, general intelligence
instruments, although inadequate, will find continued use as
long as there are not practical specific-abilities tests
available.

Diagnostic Utility. Diagnostic utility is, as the term
implies, a practical consideration relating to a test's
adequacy. Evaluations of utility are always made relative to
some operational context. Obviously, in the present case,
evaluation of diagnostic utility is being made with reference
to the school context.

Two general points about diagnostic utility deserve
comment. First, it is a legitimate concern. True, those who
are theoreticians or pure researchers may not acknowledge the
legitimacy of diagnostic utility as a criterion of test
adequacy. They may make this judgment for themselves, but they
cannot presume to impose this judgment on those who use tests
for diagnostic purposes. And, it would be obtuse for those
who have diagnostic responsibility to disregard any test's
diagnostic potential. Second, diagnostic utility should not
be confused with predictive validity. A test's predictive
validity is measured by its accuracy in predicting performance
in non-test situations; a test's diagnostic utility is
evaluated by its usefulness in prescribing effective treatment
or intervention (such as, for example, reading tests which
diagnose problem areas for the purpose of remediation).
General intelligence tests have high predictive validity for
school performance, but they are nearly useless as a basis
for prescribing treatment. Generally, if a test is being used
as a screening device, one looks for predictive validity; but
if a test is being used as a guide for treatment, one looks
for a test with diagnostic utility. The distinction is
critical to the whole issue of intelligence testing as it relates
to the disadvantaged; the fact that the tests, as currently
used in the schools, have high predictive validity is, in a
sense, the problem: as screening devices they work all too
well; as diagnostic instruments they are, if left as is,
actually dysfunctional.

For a test to have diagnostic utility it must be
specifically and differentially related to treatments or
interventions that are, practically, within the diagnostician's
domain of control. Of the two criteria, the first needs little
elaboration. The more specific a diagnosis, the more specific
the prescriptive treatment can be, and, consequently, the more
exact the evaluation of the treatment process. The second
point, the need for diagnostics to relate to the practical
domain of control, deserves more elaboration. In the abstract
it may seem obvious that if an instrument points to variables
outside the diagnostician's domain of control, little
practical use can be made of the information. Knowing that x-factor
is related to y-ailment is useful for intervention only if
x-factor can be controlled, manipulated, checked, or otherwise
affected. For this reason, the diagnostic utility of
SES-concomitant assessments would seem to be very limited; the
fact that SES is a determinant of test performance leads
nowhere in terms of direct prescriptive treatment since the
socioeconomic status of the student is outside the domain of
control for the school. (It may, of course, serve to caution
the diagnostician not to take the test score at face value,
but beyond that it provides little direction for treatment.)

The most widely used tests (the Binet and the WISC)
have very limited diagnostic utility; as measures of general
intelligence they offer little guidance for prescriptive
treatment. As a practical and interim (until specific
abilities tests can be developed*) remedy for this situation,
Meeker (1963, 1969) has proposed a method for using Binet
(or WISC) responses to derive differentiated assessments of
SOI abilities. This method has been used extensively in
studies by Meeker (1965), Feldman (1970), Brown (1971),
Karradenes (1971), Hays and Periera (1972), Hess (1972) and
Manning (1972). The study report that follows is illustrative
of the potential diagnostic utility afforded by differentiated
indices of intelligence.

*Such a project is now in progress.

STUDY

Method. This study is based on item-response data from
Stanford-Binet tests administered to 257 respondents. Using
a technique described elsewhere (Meeker, M., 1963), the item
responses were tallied according to the Structure-of-Intellect
schema (see Fig. 1). All subjects were boys who resided in
inner-city Los Angeles urban communities.

(Insert Fig. 1 about here) 

Respondents were from one of seven groups:

(1) MAS (4-5) Mexican-Americans, age 4 to 5, who took
their tests in Spanish with a Mexican-American examiner.

(2) MAE (4-5) Mexican-Americans, age 4 to 5, who took
their tests in English; they spoke English and their parents
spoke English. An interpreter, when needed, was present in
each examination.

(3) MAE (7-9) Mexican-Americans, age 7 to 9, who took
their tests in English; they spoke English and their parents
spoke English.*

(4) B (4-5) Blacks, age 4 to 5, tested in English
by Black examiners.

(5) B (7-9) Blacks, age 7 to 9, tested by Black
examiners.

(6) A (4-5) Anglos, age 4 to 5, tested by Anglo
examiners.

(7) A (7-9) Anglos, age 7 to 9, tested by Anglo
examiners.

*It was not possible to complete a sample of MAS (7-9)
to contrast and compare with MAS (4-5).

Sample Description

Group        Age Range    IQ Range   IQ Mean   Sex   Number
MAS-4-5      4.9-5.9
MAE-4-5      4.9-5.9
MAE-7-9      7.0-9.11
BLACK-4-5    4.9-5.9
BLACK-7-9    7.0-9.11
ANGLO-4-5    4.11-5.9
ANGLO-7-9    7.0-9.11

One condition of the 4-5 year old sample was that none
had had any formal preschool education; that is, none had been
in Head Start, nursery, or coop preschool. It was our intent
to try to get SOI-Binet profiles on the 4-5 year olds in an
attempt to have a sample of entering kindergarteners who were
"uncontaminated" by formal education.

We wanted to see what kinds of SOI abilities boys come
to school with when they have had limited exposure to learning
of an academic nature. The reason for selecting the
comparable age 7-9 group was to see what, if any, changes occurred
in their SOI abilities due to exposure to traditional school
learning.

The group identity of the respondents was retained in
the tally of each item-response; as a result, each datum is
characterized by a five-way classification:

GROUP--MAE(4-5), MAS(4-5), MAE(7-9), B(4-5), B(7-9),
       A(4-5), A(7-9).
OPERATION--Cognition, Memory, Evaluation, Convergent
       and Divergent Productions.
CONTENT--Figural, Symbolic, Semantic.
PRODUCT--Units, Classes, Relations, Systems,
       Transformations, Implications.
SCORE--Correct, Incorrect.
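
As a minimal sketch of how such a tally might be held in code, each item response can be stored as a five-part key and counted. The category labels follow the classification above; the function name, the validation step, and the three sample responses are hypothetical and made up for illustration. They are not the study's data or the authors' procedure.

```python
from collections import Counter

# Category levels as listed in the five-way classification.
LEVELS = {
    "GROUP": ["MAE(4-5)", "MAS(4-5)", "MAE(7-9)", "B(4-5)", "B(7-9)",
              "A(4-5)", "A(7-9)"],
    "OPERATION": ["Cognition", "Memory", "Evaluation",
                  "Convergent Production", "Divergent Production"],
    "CONTENT": ["Figural", "Symbolic", "Semantic"],
    "PRODUCT": ["Units", "Classes", "Relations", "Systems",
                "Transformations", "Implications"],
    "SCORE": ["Correct", "Incorrect"],
}
DIMENSIONS = list(LEVELS)  # fixed key order for every datum


def tally(responses):
    """Count responses per (group, operation, content, product, score) cell."""
    counts = Counter()
    for response in responses:
        # Reject any label that is not one of the listed levels.
        for dim, label in zip(DIMENSIONS, response):
            if label not in LEVELS[dim]:
                raise ValueError(f"unknown {dim} label: {label!r}")
        counts[tuple(response)] += 1
    return counts


# Made-up responses, for illustration only.
sample = [
    ("A(4-5)", "Memory", "Figural", "Units", "Correct"),
    ("A(4-5)", "Memory", "Figural", "Units", "Correct"),
    ("B(7-9)", "Cognition", "Semantic", "Relations", "Incorrect"),
]
counts = tally(sample)
print(counts[("A(4-5)", "Memory", "Figural", "Units", "Correct")])
```

Keeping the group label inside the key is what allows the same tally to be read either within one group or across groups.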

The five-way classification yields a potential data
space of 1260 cells; the sampling distribution in the data
space was too irregular to support a full multi-classification
analysis, so each of the major SOI dimensions was analyzed
independently (with consequent loss of information pertaining
to between-dimension interactive effects). Multi-classification
analyses for each of the SOI dimensions showed highly
significant differences.

GROUP X OPERATION X SCORE    χ² = 101.6457    df = 24    p<.0001

GROUP X CONTENT X SCORE      χ² = 154.1713    df = 12    p<.0001

GROUP X PRODUCT X SCORE      χ² = 170.044     df = 30    p<.0001
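
The cell count and the reported degrees of freedom can be checked with a few lines of arithmetic. The sketch below assumes that each df is the three-way interaction value, the product of (levels - 1) over the three factors. The text does not state the formula, so this is only a reading that happens to reproduce the reported figures; the function names are hypothetical.

```python
from math import prod

# Number of levels in each classification, as listed in the text.
LEVELS = {"GROUP": 7, "OPERATION": 5, "CONTENT": 3, "PRODUCT": 6, "SCORE": 2}


def cells(*dims):
    """Number of cells in the table formed by the named dimensions."""
    return prod(LEVELS[d] for d in dims)


def interaction_df(*dims):
    """Assumed df rule: product of (levels - 1) over the named dimensions."""
    return prod(LEVELS[d] - 1 for d in dims)


full_space = cells(*LEVELS)  # all five dimensions crossed
print("full data space:", full_space)

for dim in ("OPERATION", "CONTENT", "PRODUCT"):
    print(dim, cells("GROUP", dim, "SCORE"),
          interaction_df("GROUP", dim, "SCORE"))
```

Analyzing one SOI dimension at a time shrinks each table from the full space to at most 84 cells, which is why the irregular sampling could be tolerated at the cost of the between-dimension interactions.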



Each of the above relationships was further analyzed
with regard to the within-group and between-group effects.
These results are of greatest interest for the present study
since they afford two kinds of comparisons. The within-group
analyses reveal general strengths-and-weaknesses profiles for
each group, while the between-group analyses serve to anchor
these evaluations in relationships to other groups, and, by
implication, to the general population. In other words, if
a group shows particular strength in, say, cognition (among
the operations), that fact in itself would be helpful in
planning instructional programs, and if, in addition, the group
also shows strength in cognition in comparison with other
groups, this would serve to reinforce the evaluation. Thus,
in interpreting the results we look primarily to the
within-group analyses since they are most useful for instructional
prescriptions, and we look secondarily to between-group
analyses as a means of anchoring the group ability profiles.
Summaries of the within- and between-group analyses for each
of the major dimensions of the SOI are presented in Tables 1,
2, and 3.

(Insert Tables 1, 2, and 3 about here)



Discussion. We offer this study as an illustration
of the potential utility of specific ability assessment. Beyond
that, we eschew group-oriented interpretations as generally
dysfunctional for educational practice. While group results
might have limited utility for general instructional
planning, it should be patently obvious that an individual
student's profile of abilities on any or all of the SOI
dimensions may be vastly different from his group's profile on any
or all of the SOI dimensions. As obvious as this may be
statistically, one nonetheless finds, in instructional practice,
that group-type diagnoses are used as bases for prescribing
individual treatment. We explicitly disown any such use that
might be made of the results; indeed, the larger point at
issue--that specific, treatment-related, individual assessment
is an immediate remedy for intelligence testing abuses--would
be subverted by using group-oriented data as a substitute for
individual diagnostics.